545 research outputs found

    Developing and applying heterogeneous phylogenetic models with XRate

    Modeling sequence evolution on phylogenetic trees is a useful technique in computational biology. Especially powerful are models which take account of the heterogeneous nature of sequence evolution according to the "grammar" of the encoded gene features. However, beyond a modest level of model complexity, manual coding of models becomes prohibitively labor-intensive. We demonstrate, via a set of case studies, the new built-in model-prototyping capabilities of XRate (macros and Scheme extensions). These features allow rapid implementation of phylogenetic models which would have previously been far more labor-intensive. XRate's new capabilities for lineage-specific models, ancestral sequence reconstruction, and improved annotation output are also discussed. XRate's flexible model-specification capabilities and computational efficiency make it well-suited to developing and prototyping phylogenetic grammar models. XRate is available as part of the DART software package (http://biowiki.org/DART). Comment: 34 pages, 3 figures, glossary of XRate model terminolog

    Cloning and sequence analysis of cDNAs encoding the cytosolic precursors of subunits GapA and GapB of chloroplast glyceraldehyde-3-phosphate dehydrogenase from pea and spinach

    Chloroplast glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is composed of two different subunits, GapA and GapB. cDNA clones containing the entire coding sequences of the cytosolic precursors for GapA from pea and for GapB from pea and spinach have been identified and sequenced, and the derived amino acid sequences have been compared to the corresponding sequences from tobacco, maize and mustard. These comparisons show that GapB differs from GapA in about 20% of its amino acid residues and by the presence of a flexible and negatively charged C-terminal extension, possibly responsible for the observed association of the enzyme with chloroplast envelopes in vitro. This C-terminal extension (29 or 30 residues) may be susceptible to proteolytic cleavage, thereby leading to a conversion of chloroplast GAPDH isoenzyme I into isoenzyme II. Evolutionary rate comparisons at the amino acid sequence level show that chloroplast GapA and GapB evolve roughly two-fold slower than their cytosolic counterpart GapC. GapA and GapB transit peptides evolve about 10 times faster than the corresponding mature subunits. They are relatively long (68 and 83 residues for pea GapA and spinach GapB, respectively) and share a similar amino acid framework with other chloroplast transit peptides

    Stops making sense: translational trade-offs and stop codon reassignment

    Background: Efficient gene expression involves a trade-off between (i) premature termination of protein synthesis; and (ii) readthrough, where the ribosome fails to dissociate at the terminal stop. Sense codons that are similar in sequence to stop codons are more susceptible to nonsense mutation, and are also likely to be more susceptible to transcriptional or translational errors causing premature termination. We therefore expect this trade-off to be influenced by the number of stop codons in the genetic code. Although genetic codes are highly constrained, stop codon number appears to be their most volatile feature. Results: In the human genome, codons readily mutable to stops are underrepresented in coding sequences. We construct a simple mathematical model based on the relative likelihoods of premature termination and readthrough. When readthrough occurs, the resultant protein has a tail of amino acid residues incorrectly added to the C-terminus. Our results depend strongly on the number of stop codons in the genetic code. When the code has more stop codons, premature termination is relatively more likely, particularly for longer genes. When the code has fewer stop codons, the length of the tail added by readthrough will, on average, be longer, and thus more deleterious. Comparative analysis of taxa with a range of stop codon numbers suggests that genomes whose code includes more stop codons have shorter coding sequences. Conclusions: We suggest that the differing trade-offs presented by alternative genetic codes may result in differences in genome structure. More speculatively, multiple stop codons may mitigate readthrough, counteracting the disadvantage of a higher rate of nonsense mutation. This could help explain the puzzling overrepresentation of stop codons in the canonical genetic code and most variants
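
The notion of "codons readily mutable to stops" in the abstract above can be illustrated with a minimal sketch. It assumes the standard genetic code (stops TAA, TAG, TGA), upper-case DNA, and a single-substitution definition of "readily mutable"; the function names are hypothetical and this is not the authors' model.

```python
from itertools import product

BASES = "TCAG"
STOPS = frozenset({"TAA", "TAG", "TGA"})  # assumption: standard genetic code


def stop_adjacent_codons(stops=STOPS):
    """Return the sense codons that one substitution can turn into a stop."""
    adjacent = set()
    for codon in ("".join(c) for c in product(BASES, repeat=3)):
        if codon in stops:
            continue
        for pos in range(3):
            for base in BASES:
                mutant = codon[:pos] + base + codon[pos + 1:]
                if base != codon[pos] and mutant in stops:
                    adjacent.add(codon)
    return adjacent


def stop_adjacent_fraction(cds, stops=STOPS):
    """Fraction of in-frame sense codons in `cds` that are one substitution from a stop."""
    adjacent = stop_adjacent_codons(stops)
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    sense = [c for c in codons if c not in stops]
    return sum(c in adjacent for c in sense) / len(sense) if sense else 0.0
```

Passing a different `stops` set makes it possible to compare codes with fewer or more stop codons, which is the comparison the abstract describes.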

    Actinopolyspora algeriensis sp. nov., a novel halophilic actinomycete isolated from a Saharan soil

    A halophilic actinomycete strain, designated H19T, was isolated from a Saharan soil in the Bamendil region (Ouargla province, South Algeria) and was characterized taxonomically by using a polyphasic approach. The morphological and chemotaxonomic characteristics of the strain were consistent with those of members of the genus Actinopolyspora, and 16S rRNA gene sequence analysis confirmed that strain H19T was a novel species of the genus Actinopolyspora. The DNA–DNA hybridization value between strain H19T and the nearest Actinopolyspora species, A. halophila, was clearly below the 70 % threshold. The genotypic and phenotypic data showed that the organism represents a novel species of the genus Actinopolyspora, for which the name Actinopolyspora algeriensis sp. nov. is proposed, with the type strain H19T (= DSM 45476T = CCUG 62415T)

    Non-random pre-transcriptional evolution in HIV-1. A refutation of the foundational conditions for neutral evolution

    The complete base sequences of the HIV-1 virus and of the GP120 ENV gene were analyzed to establish their distance to the expected neutral random sequence. A special methodology was devised to achieve this aim. Analyses included: a) proportion of dinucleotides (signatures); b) homogeneity in the distribution of dinucleotides and bases (isochores) by dividing both segments into ten and three sub-segments, respectively; c) probability of runs of bases and No-bases according to the Bose-Einstein distribution. The analyses showed a huge deviation from the random distribution expected from neutral evolution and neutral-neighbor influence of nucleotide sites. The most significant result is the tremendous lack of CG dinucleotides (p < 10^-50), a selective trait of eukaryote genomes and not of single-stranded RNA virus genomes. Results not only refute neutral evolution and neutral-neighbor influence, but also strongly indicate that any base at any nucleotide site correlates with the whole viral genome or its sub-segments. These results suggest that evolution of HIV-1 is pan-selective rather than neutral or nearly neutral
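
A dinucleotide "signature" of the kind mentioned in item a) is commonly summarized as an observed/expected ratio. The sketch below assumes that convention (observed dinucleotide frequency divided by the product of the two base frequencies); it is an illustration, not the methodology devised in the paper, and the function name is hypothetical.

```python
from collections import Counter


def dinucleotide_odds_ratios(seq):
    """Observed/expected ratio for each dinucleotide present in `seq`.

    Expected frequency assumes independent neighboring bases: f(X) * f(Y).
    A ratio well below 1 (e.g. for "CG") indicates under-representation.
    """
    seq = seq.upper()
    n_bases = len(seq)
    n_pairs = n_bases - 1
    base_counts = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    pair_counts = Counter(seq[i:i + 2] for i in range(n_pairs))
    ratios = {}
    for pair, count in pair_counts.items():
        expected = (base_counts[pair[0]] / n_bases) * (base_counts[pair[1]] / n_bases)
        ratios[pair] = (count / n_pairs) / expected
    return ratios
```

Dinucleotides that never occur are simply absent from the returned dictionary, so `ratios.get("CG", 0.0)` is one way to read a missing pair as fully depleted.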

    Assessing constancy of substitution rates in viruses over evolutionary time

    Background: Phylogenetic analyses reveal probable patterns of divergence of present-day organisms from common ancestors. The points of divergence of lineages can be dated if a corresponding historical or fossil record exists. For many species, in particular viruses, such records are rare. Recently, Bayesian phylogenetic analyses using sequences from closely related organisms isolated at different times have been used to calibrate divergences. Phylogenetic analyses depend on the assumption that the average substitution rates that can be calculated from the data apply throughout the course of evolution. Results: The present study tests this crucial assumption by charting the kinds of substitutions observed between pairs of sequences with different levels of total substitutions. Datasets of aligned sequences, both viral and non-viral, were assembled. For each pair of sequences in an aligned set, the distribution of nucleotide interchanges and the total number of changes were calculated. Data were binned according to total numbers of changes and plotted. The accumulation of the six possible interchange types in retroelements as a function of distance followed closely the expected hyperbolic relationship. For other datasets, however, significant deviations from this relationship were noted. A rapid initial accumulation of transition interchanges was frequent among the datasets and anomalous changes occurred at specific divergence levels. Conclusions: The accumulation profiles suggested that substantial changes in frequencies of types of substitutions occur over the course of evolution and that such changes should be considered in evaluating and dating viral phylogenies.
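
The per-pair tally described in the Results (the six interchange types plus the total number of changes) might look like the sketch below. It assumes pre-aligned sequences of equal length and skips gaps and ambiguity codes; the names are hypothetical, and the binning and plotting steps from the abstract are not reproduced.

```python
# The six unordered interchange types; "AG" and "CT" are transitions,
# the other four are transversions.
INTERCHANGES = ("AG", "CT", "AC", "AT", "CG", "GT")


def interchange_counts(seq_a, seq_b):
    """Count each interchange type between two aligned sequences.

    Returns (counts_by_type, total_changes). Sites where either sequence
    has a gap or a non-ACGT symbol are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {kind: 0 for kind in INTERCHANGES}
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a != b and a in "ACGT" and b in "ACGT":
            counts["".join(sorted((a, b)))] += 1
    return counts, sum(counts.values())
```

Running this over every pair in an aligned set and grouping the results by `total_changes` would give the kind of accumulation profile the abstract refers to.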

    Classification and Identification of Bacteria by Mass Spectrometry and Computational Analysis

    Background: In general, the definite determination of bacterial species is a tedious process and requires extensive manual labour. Novel technologies for bacterial detection and analysis can therefore help microbiologists in minimising their efforts in developing a number of microbiological applications. Methodology: We present a robust, standardized procedure for automated bacterial analysis that is based on the detection of patterns of protein masses by MALDI mass spectrometry. We particularly applied the approach for classifying and identifying strains in species of the genus Erwinia. Many species of this genus are associated with disastrous plant diseases such as fire blight. Using our experimental procedure, we created a general bacterial mass spectra database that currently contains 2800 entries of bacteria of different genera. This database will be steadily expanded. To support users with a feasible analytical method, we developed and tested comprehensive software tools that are demonstrated herein. Furthermore, to gain additional analytical accuracy and reliability in the analysis we used genotyping of single nucleotide polymorphisms by mass spectrometry to unambiguously determine closely related strains that are difficult to distinguish by only relying on protein mass pattern detection. Conclusions: With the method for bacterial analysis, we could identify fire blight pathogens from a variety of biological sources. The method can be used for a number of additional bacterial genera. Moreover, the mass spectrometry approac

    Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective for characterizing the mode of evolution of satellite DNA.
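
The two calculations named in the abstract above, nucleotide frequency across the array and pairwise comparison of member sequences, can be sketched as follows. The sketch assumes aligned repeat units of equal length and uses the uncorrected proportion of differing sites (p-distance) as a simple stand-in for the phylogenetic distances in the paper; the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def column_frequencies(aligned):
    """Per-position frequencies of A, C, G, T across aligned member sequences."""
    freqs = []
    for column in zip(*aligned):
        bases = [b for b in column if b in "ACGT"]  # skip gaps and ambiguity codes
        counts = Counter(bases)
        freqs.append({b: (counts[b] / len(bases) if bases else 0.0) for b in "ACGT"})
    return freqs


def mean_pairwise_p_distance(aligned):
    """Mean proportion of differing sites over all pairs of member sequences."""
    distances = []
    for a, b in combinations(aligned, 2):
        sites = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
        if sites:
            distances.append(sum(x != y for x, y in sites) / len(sites))
    return sum(distances) / len(distances) if distances else 0.0
```

A low mean distance within a species alongside near-fixed column frequencies would be consistent with recent homogenization, which is the kind of contrast the abstract draws between species.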